DDMCPP: The Data-Driven Multithreading C Pre-Processor
Authors
Abstract
Improving single-thread performance through more complex structures and higher clock frequencies is reaching its limits. Several architectures have therefore been proposed that address this problem by exploiting coarse-grained multithreading. One such architecture is the Data-Driven Multithreading Chip Multiprocessor (DDM-CMP), which is based on a dataflow-like model of execution. To fully explore the benefits of a new architecture, an automated compiler is required. Because a fully automated tool requires tremendous effort, semi-automated solutions are often the first approach. In this paper we present the DDM C Pre-Processor (DDMCPP), a specialized tool that allows easy programming for the DDM-CMP architecture. The programmer's only responsibility is to express the available parallelism using special directives. DDMCPP then converts the code into a C program that can be compiled for the DDM-CMP architecture with a commodity compiler.
Similar resources
Speculative Data-Driven Multithreading
Mispredicted branches and loads that miss in the cache cause the majority of retirement stalls experienced by sequential processors; we call these critical instructions. Despite their importance, a sequential processor has difficulty prioritizing critical computations (computations of critical instructions), because it must fetch all computations sequentially, regardless of their contribution t...
Data-Driven Multithreading Programming Tool-chain
The increasing parallelism offered by the parallel architectures introduced by processor vendors, coupled with the need to extract more parallelism out of the applications, has led the community to examine more efficient programming and execution models. The Dataflow Multithreading model is known to be the model that can exploit the most parallelism out of a wide range of applications. The Data...
Pre-execution via Speculative Data-driven Multithreading
This dissertation introduces pre-execution, a novel technique for accelerating sequential programs. Pre-execution directly attacks the instructions that cause performance problems: mispredicted branches and loads that miss in the cache. In pre-execution, future branch outcomes and load addresses are computed on the side and the results are fed to the main program. In doing so, the main program is spared ...
Modeling and Analysis of Simultaneous Multithreading
In simultaneous multithreading, several threads can issue instructions in each processor cycle. A simple and versatile timed Petri net model of simultaneous multithreading is proposed and is used to compare the performance of architectures with and without simultaneous multithreading. Performance results are obtained by event-driven simulation of net models and are verified by state–space–based...
Datarol: A Parallel Machine Architecture for Fine-Grain Multithreading
In this paper, we discuss the design principles of a massively parallel distributed-memory multiprocessor architecture and introduce the Datarol-II machine architecture. We present the Datarol-II processor design, including the communication protocol and the handling mechanisms for remote memory access and remote process/procedure invocation. Several evaluation data of the Datarol-II processor are shown from ...